Abstract

A neural network technique is used to discriminate between quark and gluon jets produced in
the $qg \to q + \gamma$ and $q\bar{q} \to g + \gamma$ processes at the LHC. Considering the network as a trigger and
using the PYTHIA event generator and CMSJET, the full event fast simulation package for the
CMS detector, we obtain signal-to-background ratios.



1. Introduction. 

Two QCD processes give the main contribution to the production of a direct photon:
the Compton-like process

$qg \to q + \gamma$ (1)

and the annihilation process

$q\bar{q} \to g + \gamma$. (2)

It was proposed in our paper ^ to use the direct photon production processes to
extract the gluon distribution function in a proton, $f_g^p(x, Q^2)$. This can be done by selecting
those "$\gamma$ + jet" events which satisfy the criteria pointed out in and [||]. These criteria suppress
the next-to-leading order diagrams with initial state radiation, as well as the background to
direct photon production from the neutral decay channels of $\pi^0$, $\eta$, $K_S^0$ and $\omega$ mesons
and from the photons radiated from a quark in QCD processes with large cross sections
(such as $qg \to qg$, $qq \to qq$ and $q\bar{q} \to q\bar{q}$ scatterings).

The percentage of the Compton-like process (1) (which together with (2) amounts to 100%)
for different intervals of transverse energy $E_t^{jet}$ ($\approx E_t^{\gamma}$) and pseudorapidity $\eta^{jet}$
is given in Table 1:



Table 1: Percentage of the Compton-like process $qg \to \gamma + q$.

Calorimeter part    $E_t^{jet}$ interval (GeV)
                    40 - 50    100 - 120    200 - 240
Barrel              89         84           78
Endcap+Forward      86         82           74



In the table above "Barrel" corresponds to the Barrel region of the CMS
calorimeter ($|\eta| < 1.4$), while "Endcap+Forward" corresponds to the
Endcap+Forward region ($1.4 < |\eta| < 5.0$).

Thus, the admixture of processes with a gluon jet in the final state grows
from the upper left corner of the table to the lower right one, i.e. with the jet energy. Therefore,
to collect a clean sample of "$\gamma$ + quark jet" events it is necessary to reject
"$\gamma$ + gluon jet" events. This is most important in the Endcap+Forward region for
jets with $E_t^{jet} > 100$ GeV, where the fraction of "$\gamma$ + gluon jet" events is about
20% or more and where one can reach the smallest $x$ values of the gluon distribution function
$f_g^p(x, Q^2)$ (see [lI|]).

The idea of using an Artificial Neural Network (ANN) to discriminate quark jets
from gluon jets has been widely discussed in the literature (iQ] - [||]). In [^, ^ the
discrimination procedure is described for $e^+e^-$ reactions at $\sqrt{s} = 29$ and 92 GeV with three
different Monte Carlo (MC) generators: JETSET, ARIADNE and HERWIG. When tested
with the middle point criterion, the network correctly classified, on
average, 85% of the quark and gluon jets in the testing set. The MC independence of the
results was also demonstrated by training with MC data simulated by one generator
and testing with MC data from another. We also refer to [11], where the MC
independence (JETSET/HERWIG) of a quark/gluon separation procedure based on
the moment analysis of jet particles is presented.

In [§] the ANN was applied to a set of $p\bar{p}$ events at $\sqrt{s} = 630$ GeV generated
with PYTHIA [^]. The UA2 calorimeter geometry was used there to classify quark
and gluon jets produced in the $qq \to qq$, $q\bar{q} \to q\bar{q}$ and $gg \to gg$ QCD subprocesses
alone. A classification ability of 70 - 72% with respect to the middle point criterion
was reached there. In this paper we use the ANN approach to get the most effective
discrimination of quark and gluon jets in processes (1) and (2) selected by the cuts
given in Section 3 (and earlier in [Q], [Q]). Similar results were obtained in |Q]
by using two- and three-layer networks for quark/gluon jet classification in
$p\bar{p} \to 2$ jets events at $\sqrt{s} = 630$ GeV.

The study was carried out using the JETNET 3.0 package [^ developed at CERN
and the University of Lund ¹.



2. Artificial Neural Network. 

2.1 Generality and mathematical model of the neural network. 

ANNs are often used to optimize a classification (or pattern recognition) procedure
and have been applied to many pattern recognition problems in high energy physics (see
- [p^], [17] - [19]) with notable success. They usually have more input
than output nodes and thus may be viewed as performing a dimensionality reduction
of the input data set.

The ANN approach is a technique which assigns objects to various classes.
These objects can be different data types, such as a signal and a background in our
case. Each data type is assigned to a class, which in the context of this paper
is 0 for the background (gluon jet) and 1 for the signal (quark jet). Discrimination
is achieved by looking at the class to which the data belong. The technique
fully exploits the correlations among different variables and provides a discriminating
boundary between the signal and the background.

ANNs are able to learn, remember and create relationships amongst the
data. There are many different types of ANN, but the feed forward type is the most
popular in high energy physics. Feed forward implies that information can flow
in one direction only, and the output directly determines the probability that an event
characterized by some input pattern vector $X(x_1, x_2, \ldots, x_n)$ is from the signal class.

¹ It is available via anonymous ftp from thep.lu.se or from freehep.scri.fsu.edu.
The mathematical model of the Neural Network (NN) reflects three basic functions
of a biological neuron:

• sum up all the information arriving at the inputs of the node/neuron;

• if the sum is greater than some threshold, fire the neuron;

• after firing, return to the initial state and send a signal to each of the
neighboring neurons in the network.

A neuron with these characteristics is known as an elementary perceptron. The
perceptron is a simple feed forward system with several input connections and a
single output connection.



Fig. 1. Neural network with one layer of hidden units (the diagram labels the hidden nodes
and the output node).



Mathematically the output can be written as

$O(x_1, x_2, \ldots, x_n) = g\left(\frac{1}{T}\sum_i \omega_i x_i + \theta\right)$. (3)

Here $g$ is a non-linear transfer function, which typically takes the following form
(sigmoid function):

$g(x) = \frac{1}{1 + e^{-2x}}$. (4)

$(x_1, x_2, \ldots, x_n)$ is the input pattern vector, $O$ is the output, and $\omega_i$ and $\theta$ are independent
parameters: the weights (which connect the input nodes to the output node) and the
threshold of the output node, respectively. $\beta = 1/T$ is called the inverse temperature and
defines the slope of $g$.

The components $x_i$ of the pattern vector are multiplied by the connection weights $\omega_i$, so that
each piece of information arrives at the perceptron as $\omega_i x_i$. The perceptron then sums all
the incoming information to give $\sum_i \omega_i x_i$ and applies the transfer function $g$ to give
the output (see (3)).
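The single-perceptron output described above can be sketched in a few lines of Python. This is only an illustration, not JETNET code: the function names `transfer` and `perceptron_output` and all numerical values are assumptions made for the example, and the factor $1/T$ is applied to the weighted sum only, following the form of equation (3).

```python
import math

def transfer(x):
    """Sigmoid transfer function g(x) = 1 / (1 + exp(-2x)), eq. (4)."""
    return 1.0 / (1.0 + math.exp(-2.0 * x))

def perceptron_output(x, weights, threshold, temperature=1.0):
    """Output O = g(sum_i(w_i * x_i) / T + theta) of one perceptron, eq. (3)."""
    weighted_sum = sum(w * xi for w, xi in zip(weights, x))
    return transfer(weighted_sum / temperature + threshold)

# Hypothetical input pattern and parameters, chosen only for illustration.
pattern = [0.5, -1.0, 2.0]
weights = [0.2, 0.4, 0.1]
print(perceptron_output(pattern, weights, threshold=0.1))
```

A larger inverse temperature $\beta = 1/T$ makes the response to the weighted sum steeper, which is the role of the slope parameter mentioned in the text.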



In a feed forward NN the neurons are arranged in layers. Figure 1 shows
the feed forward NN with one hidden layer that is used here. In this case the output
of the NN is

$O(x_1, x_2, \ldots, x_n) = g\left(\frac{1}{T}\sum_j \omega_j\, g\left(\frac{1}{T}\sum_k \omega_{jk} x_k + \theta_j\right) + \theta\right)$, (5)

where $\omega_{jk}$ is the weight connecting the input node $k$ to the hidden node $j$, the $\omega_j$
connect the hidden nodes to the output node, and $\theta_j$ and $\theta$ are the thresholds of the hidden
nodes and the output node, respectively.
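The one-hidden-layer output of equation (5) can be sketched in the same spirit. The function name `network_output`, the argument layout (one row of weights per hidden node) and the toy numbers are assumptions for illustration only; JETNET itself is a Fortran package with its own interface.

```python
import math

def g(x):
    """Sigmoid transfer function g(x) = 1 / (1 + exp(-2x))."""
    return 1.0 / (1.0 + math.exp(-2.0 * x))

def network_output(x, w_hidden, theta_hidden, w_out, theta_out, T=1.0):
    """Feed forward pass through one hidden layer, following eq. (5).

    w_hidden[j][k] connects input node k to hidden node j,
    w_out[j] connects hidden node j to the single output node.
    """
    hidden = [
        g(sum(w_jk * x_k for w_jk, x_k in zip(row, x)) / T + theta_j)
        for row, theta_j in zip(w_hidden, theta_hidden)
    ]
    return g(sum(w_j * h_j for w_j, h_j in zip(w_out, hidden)) / T + theta_out)

# Toy network with 2 inputs and 2 hidden nodes (all numbers are hypothetical).
output = network_output(
    x=[1.0, 0.5],
    w_hidden=[[0.3, -0.2], [0.1, 0.4]],
    theta_hidden=[0.0, -0.1],
    w_out=[0.7, -0.5],
    theta_out=0.05,
)
print(f"NN output O = {output:.3f}")
```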
2.2 Learning of the perceptron.

The behavior of a perceptron is determined by independent parameters known as
weights and thresholds. The total number of independent parameters in a neural
network with a single hidden layer is given by

$N_{ind} = (N_{in} + N_{on}) \cdot N_{hn} + N_{ht} + N_{ot}$, (6)

where $N_{in}$ is the number of input nodes, $N_{on}$ is the number of output nodes, $N_{hn}$ is the
number of nodes in the single hidden layer, $N_{ht}$ is the number of thresholds in the single
hidden layer and $N_{ot}$ is the number of output thresholds.
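As a worked example of equation (6), the parameter count can be evaluated with a small helper. The sketch assumes one threshold per hidden node and per output node ($N_{ht} = N_{hn}$, $N_{ot} = N_{on}$); the hidden layer size of 10 is a hypothetical value chosen for illustration, while the 46 input nodes correspond to the largest input set described in Section 4.

```python
def count_parameters(n_in, n_hidden, n_out=1):
    """Number of independent parameters N_ind of eq. (6).

    Assumes one threshold per hidden node and per output node,
    i.e. N_ht = N_hn and N_ot = N_on.
    """
    n_hidden_thresholds = n_hidden
    n_output_thresholds = n_out
    return (n_in + n_out) * n_hidden + n_hidden_thresholds + n_output_thresholds

# 46 input nodes, 1 output node and a hypothetical hidden layer of 10 nodes.
n_ind = count_parameters(n_in=46, n_hidden=10)
print(n_ind)            # 481
print(14000 / n_ind)    # training patterns per parameter for 2 x 7000 events
```

With these assumed numbers a training sample of 7000 signal plus 7000 background events gives roughly 29 patterns per parameter.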

Learning is the process of adjusting these $N_{ind}$ parameters. During learning
every perceptron is shown examples of what it must learn to interpret. Learning is performed
on a training set consisting of two parts: the training data (a collection of input patterns
to the perceptron) and the training target, which is the desired output for each pattern.

Mathematically, the goal of training is to minimize a measure of the error. The
mean squared error function $E$ averaged over the training sample is defined by the equation

$E = \frac{1}{2N_p}\sum_{p=1}^{N_p}\sum_{i=1}^{N}\left(O_i^{(p)} - t_i^{(p)}\right)^2$, (7)

where $O_i$ is the output of the $i$th node of the NN in equation (5); $t_i$ is the training
target (in our case, 0 for the background and 1 for the signal); $N_p$ is the number of
patterns (events) in the training sample; $N$ is the number of network outputs ($N = 1$
in our case).
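The error function can be illustrated with a short Python sketch. The function name `mean_squared_error` is hypothetical, and the prefactor $1/(2N_p)$ used below is an assumption about the normalization of equation (7); a different constant prefactor would not change the position of the minimum.

```python
def mean_squared_error(outputs, targets):
    """E = 1 / (2 N_p) * sum_p sum_i (O_i^(p) - t_i^(p))^2.

    outputs and targets are lists with one entry per pattern; each entry
    is a list with one value per network output node.
    """
    n_patterns = len(outputs)
    total = sum(
        (o - t) ** 2
        for out_p, tgt_p in zip(outputs, targets)
        for o, t in zip(out_p, tgt_p)
    )
    return total / (2.0 * n_patterns)

# Two hypothetical patterns with a single output node (N = 1);
# the target is 1 for signal (quark jet) and 0 for background (gluon jet).
print(mean_squared_error([[0.8], [0.3]], [[1.0], [0.0]]))
```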

There are several algorithms for error minimization and weight updating. The most
popular are the Backpropagation, Langevin and Manhattan methods. In the last one
the weights are updated during learning by the following rule ²:

$\omega_{t+1} = \omega_t + \Delta\omega$, (8)

$\Delta\omega = -\eta \cdot \mathrm{sgn}[\partial E/\partial\omega]$, (9)

where $\omega$ is the vector of weights and thresholds used in the network; $t$ ($t+1$) refers
to the previous (current) training cycle; and $\eta$ is the learning rate, which is decreased
during the learning process.
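A minimal sketch of the Manhattan rule (8) - (9) is given below. The function names are hypothetical and the gradient values are made up; in a real training the gradients $\partial E/\partial\omega$ would come from the network, and $\eta$ would be decreased from cycle to cycle.

```python
def sign(x):
    """Return -1, 0 or +1 according to the sign of x."""
    return (x > 0) - (x < 0)

def manhattan_step(weights, gradients, learning_rate):
    """One Manhattan update: every weight moves by a fixed step of size
    learning_rate against the sign of its error gradient, eqs. (8)-(9)."""
    return [w - learning_rate * sign(dE_dw) for w, dE_dw in zip(weights, gradients)]

# Hypothetical weights and gradients; 0.005 is the learning rate value
# quoted in Section 5.
weights = [0.50, -0.20, 0.10]
gradients = [3.0, -0.01, 0.0]
print(manhattan_step(weights, gradients, learning_rate=0.005))
```

Only the sign of the gradient enters, so the step size does not depend on how steep the error surface is.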



3. Event selection and Monte Carlo simulations for the ANN analysis.

Our selection conditions for "$\gamma$ + jet" events are based on the selection rules chosen
in If] and [||]. We suppose the electromagnetic calorimeter (ECAL) to be limited
by $|\eta| < 2.61$ and the hadronic calorimeter (HCAL) by $|\eta| < 5.0$ (the CMS
geometry; see [14] and [15]), where $\eta = -\ln(\tan(\theta/2))$ is the pseudorapidity ³ defined
through the polar angle $\theta$ counted from the beam line. In the plane transverse to the
beam line the azimuthal angle $\phi$ defines the directions of $\vec{E}_t^{jet}$ and $\vec{E}_t^{\gamma}$.

² See 1^] for a more complete description.

³ Not to be confused with the learning rate, which is also designated by $\eta$.

1. We select the events with one jet and one photon candidate with

$E_t^{\gamma} > 40$ GeV and $E_t^{jet} > 30$ GeV. (10)

A jet is defined here according to the PYTHIA jetfinding algorithm LUCELL ⁴. The
jet cone radius $R$ in the $\eta - \phi$ space is taken as $R = ((\Delta\eta)^2 + (\Delta\phi)^2)^{1/2} = 0.7$.

2. Only the events with "isolated" photons are taken, to suppress the background
processes. To do this, we

a) restrict the value of the scalar sum of $E_t$ of hadrons and other particles
surrounding a photon within a cone of $R_{isol}^{\gamma} = ((\Delta\eta)^2 + (\Delta\phi)^2)^{1/2} = 0.7$ ("absolute
isolation cut")

$\sum_i E_t^i \equiv E_t^{isol} < E_{t\,CUT}^{isol}$; (11)

b) restrict the value of the fraction ("relative isolation cut")

$\sum_i E_t^i / E_t^{\gamma} \equiv \epsilon^{\gamma} < \epsilon^{\gamma}_{CUT}$; (12)

c) accept only the events having no charged tracks (particles) with $E_t > 1$ GeV
within the $R_{isol}^{\gamma}$ cone around a photon candidate.

3. We consider the structure of every event with the photon candidate at a more
precise level of the 5x5 crystal cells window (size of one CMS HCAL tower) with a cell
size of 0.0175x0.0175. To suppress the background events with photons
resulting from high-energy $\pi^0$, $\eta$, $\omega$ and $K_S^0$ mesons we require that

either (a1) there is no high $E_t$ hadron in this 5x5 crystal cells window (at the PYTHIA
level of simulation) ⁵:

$E_t^{hadr} < 5$ GeV, (13)

or (a2) the transverse energy deposited in the HCAL within the radius $R = 0.7$, counted
from the center of gravity of the HCAL tower just behind the ECAL 5x5 window
containing the direct photon signal, is limited by (at the level of the full event
simulation; see below):

$E_t^{HCAL} <$ ^ GeV. (14)

⁴ PYTHIA's default jetfinding algorithm.

⁵ At the PYTHIA level of simulation this cut may effectively take into account the imposition of an
upper cut on the HCAL signal in the tower behind the ECAL 5x5 crystal cells window hit by the
direct photon (see pc|]).
4. We select the events with the vector $\vec{E}_t^{jet}$ being "back-to-back" to the vector $\vec{E}_t^{\gamma}$ within
$\Delta\phi$ in the plane transverse to the beam line, with $\Delta\phi$ defined by the equation

$\phi_{(\gamma, jet)} = 180^\circ \pm \Delta\phi$ ($\Delta\phi = 15^\circ, 10^\circ, 5^\circ$) (15)

($5^\circ$ is the size of one CMS HCAL tower in $\phi$), where the angle $\phi_{(\gamma, jet)}$ is defined by
$\vec{E}_t^{\gamma} \cdot \vec{E}_t^{jet} = E_t^{\gamma} E_t^{jet} \cdot \cos(\phi_{(\gamma, jet)})$ with $E_t^{\gamma} = |\vec{E}_t^{\gamma}|$, $E_t^{jet} = |\vec{E}_t^{jet}|$.

5. To further suppress the background events, we choose only the events that do not
have any other (except one jet) minijet-like or cluster high $E_t$ activity with $E_t^{clust}$
higher than some threshold $E_{t\,CUT}^{clust}$. Thus we select events with

$E_t^{clust} < E_{t\,CUT}^{clust}$, (16)

where the clusters are found by the same jetfinder LUCELL that is used to find the jet in the same
event.

The following values of the cut parameters were used here:

$E_{t\,CUT}^{isol} =$ ^ GeV; $\epsilon^{\gamma}_{CUT} = 7\%$; $\Delta\phi < 15^\circ$; $E_{t\,CUT}^{clust} =$ W GeV. (17)
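The logic of cuts (10), (11), (12), (15) and (16) can be sketched as a single Python function. This is an illustration under stated assumptions, not the analysis code: the function names and the event values are hypothetical, the two $E_t$ thresholds (`et_isol_cut`, `et_clust_cut`) are passed in as arguments and the numbers given to them in the example are placeholders, and the charged track veto and the 5x5 window requirements are left out for brevity. Only the 7% relative isolation cut and the $15^\circ$ back-to-back window are taken from the text.

```python
def delta_phi_deg(phi1, phi2):
    """Absolute azimuthal separation of two directions, in degrees (0..180)."""
    d = abs(phi1 - phi2) % 360.0
    return 360.0 - d if d > 180.0 else d

def passes_cuts(et_gamma, phi_gamma, et_jet, phi_jet, et_isol, cluster_ets,
                et_isol_cut, et_clust_cut, eps_cut=0.07, dphi_cut=15.0):
    """Apply cuts (10), (11), (12), (15) and (16) to one event.

    Energies are in GeV and angles in degrees. et_isol is the scalar Et sum
    in the isolation cone around the photon; cluster_ets lists the Et of
    the clusters other than the jet.
    """
    if not (et_gamma > 40.0 and et_jet > 30.0):                     # cut (10)
        return False
    if not et_isol < et_isol_cut:                                   # cut (11)
        return False
    if not et_isol / et_gamma < eps_cut:                            # cut (12)
        return False
    if abs(delta_phi_deg(phi_gamma, phi_jet) - 180.0) > dphi_cut:   # cut (15)
        return False
    return all(et < et_clust_cut for et in cluster_ets)             # cut (16)

# Hypothetical event; the two Et thresholds are placeholder values.
print(passes_cuts(et_gamma=45.0, phi_gamma=10.0, et_jet=43.0, phi_jet=185.0,
                  et_isol=1.5, cluster_ets=[3.0, 6.5],
                  et_isol_cut=5.0, et_clust_cut=10.0))
```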

To obtain the results of this paper we used two types of event generation:

(a) by PYTHIA alone, based on the averaged calorimeter cell sizes $\Delta\eta \times \Delta\phi$: 0.087 x
0.087 in the Barrel, 0.134 x 0.174 in the Endcap and 0.167 x 0.174 in the Forward
parts;

(b) by CMSJET, the full-event fast Monte Carlo simulation package for the response
of the CMS detector [13], with the calorimeter and magnetic field effects switched on.

The following $E_t^{\gamma}$ intervals were considered for both types of generation:
$40 < E_t^{\gamma} < 50$, $100 < E_t^{\gamma} < 120$ and $200 < E_t^{\gamma} < 240$ GeV. In addition, for every
$E_t^{\gamma}$ interval we separate the regions to which the jet belongs: Barrel ($|\eta^{jet}| < 1.4$)
and Endcap+Forward ($1.4 < |\eta^{jet}| < 4.5$). Since a jet is a spatially extended object,
some energy leakage from one calorimeter part to another is possible. To distinguish
the cases when a jet is in the Barrel or in the Endcap+Forward region, the following
restriction was added to cuts 1 - 5:

$\Delta E_t^{jet}/E_t^{jet} = 0$ - for the PYTHIA level study; (18)
$\Delta E_t^{jet}/E_t^{jet} < 0.05$ - for the CMSJET level study. (19)

Here $\Delta E_t^{jet}$ is the jet $E_t$ leakage from that part of the calorimeter in which the jet
gravity center was found.

4. Training and testing of ANN.

There are two stages in the neural network analysis. The first is the training of the
network and the second is testing. The NN is trained with samples of signal and background
events and tested using independent data sets. Training of the network corresponds
to a step-by-step change of the weights $\omega_{jk}$ such that a given input vector
$X^{(p)}(x_1, x_2, \ldots, x_n)$ produces an output value $O^{(p)}$ that equals the desired output or
target value $t^{(p)}$ (see (5) and (7)).

The input parameters used in the 0th (input) layer of the network (Fig. 1) were
chosen as follows. In "Set 01" and "Set 02" we analyzed the jet information obtained
in the PYTHIA simulation. In Set 01 we assigned $E_t$, $\eta$ and $\phi$ of the first $E_t$ leading
cell to the nodes $x_1$, $x_2$ and $x_3$, respectively. Then we took the second leading cell
and assigned its $E_t$, $\eta$ and $\phi$ to the nodes $x_4$, $x_5$ and $x_6$. The same was done for the
remaining 13 cells. So, we had 45 input nodes in total ⁶. In Set 02 we added a 46th
input node with the number of charged tracks $N_{track}$ with $E_t > 1$ GeV inside the jet.
For "Set 1" and "Set 2" we repeated the previous procedure, but with respect to the
cells of jets found after the fast Monte Carlo simulation of the whole event using
CMSJET. Analogously, we had 45 and 46 (including the $N_{track}$ information) input nodes for Sets
1 and 2.
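The construction of the input pattern can be sketched as follows. The function name `build_input_vector` and the representation of a cell as an `(et, eta, phi)` tuple are assumptions for illustration, and so is the zero padding used when a jet has fewer than 15 cells, which the text does not specify.

```python
def build_input_vector(cells, n_cells=15, n_tracks=None):
    """Build the NN input pattern from the Et-leading cells of a jet.

    cells is a list of (et, eta, phi) tuples. The n_cells cells with the
    largest Et are taken in decreasing Et order and flattened to
    [et1, eta1, phi1, et2, eta2, phi2, ...]. If n_tracks is given, the
    number of charged tracks is appended as one more input node.
    """
    leading = sorted(cells, key=lambda cell: cell[0], reverse=True)[:n_cells]
    # Zero padding for jets with fewer than n_cells cells (an assumption).
    leading += [(0.0, 0.0, 0.0)] * (n_cells - len(leading))
    vector = [value for cell in leading for value in cell]
    if n_tracks is not None:
        vector.append(float(n_tracks))
    return vector

# Hypothetical jet with three cells and seven charged tracks.
jet_cells = [(5.0, 0.10, 0.20), (12.0, 0.05, 0.15), (2.5, 0.20, 0.30)]
print(len(build_input_vector(jet_cells)))               # 45 nodes (Sets 01 and 1)
print(len(build_input_vector(jet_cells, n_tracks=7)))   # 46 nodes (Sets 02 and 2)
```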

To ensure convergence and stability, the total number of training patterns (events)
must be significantly (20 - 30 times) larger than the number of independent parameters
(see (6)). About 7000 signal events (with a quark jet) and 7000 background events (with a
gluon jet) were chosen for the training stage, i.e. about 30 patterns per weight.




Fig. 2: Neural network output for quark and gluon jets that were found in the Endcap+Forward region,
$40 < E_t^{\gamma} < 50$ GeV.

After the NN was trained, a test procedure was implemented in which
events not used in the training were passed through the network. The same proportion
of signal and background events (about 7000 of each sort) was used at the
generalization stage. An output was provided for each event and could be considered
as a probability that the event is from either the signal or the background sample. If the
training is done correctly, the probability for an event to be signal is high if the output $O$
is close to 1. Conversely, if the output $O$ is close to 0, the event is more likely to be a
background event (see Fig. 2 for the case of jets found in the Endcap+Forward region
and $40 < E_t^{\gamma} < 50$ GeV as an example of a typical NN output).

⁶ This input set is the same as in |^. It was checked that varying the input down to 10 or up to
20 cells does not much affect the result.

5. The choice of neural network architecture and learning parameters.

To investigate the dependence of the separation power on the learning parameters,
we trained a neural network with 7000 signal events and 7000 background events
found after the CMSJET simulation. In those events the direct photon $E_t$ was chosen
to be $100 < E_t^{\gamma} < 120$ GeV and the jets were found in the Barrel region.

The network was tested with an independent set of 7000 signal and 7000 background
events. The sensitivity to different NN parameters was tested from the point of view of
the NN quark/gluon separation probability with respect to the "0.5-criterion" (point
0.5 of the NN output). These parameters are listed below and the corresponding plots
are given in Figs. 3 and 4.

• Number of training cycles

We varied the number of training cycles from 100 to 1000 to investigate the
effect of training on the network performance. The result shown in Fig. 3
indicates that the network is stable if more than 200 training cycles are used.

• Inverse temperature

The inverse temperature $\beta$ determines the steepness of the transfer function $g(x)$
(4). In the upper left plot of Fig. 4 the quark/gluon separation probability
drops by 1% as one goes from $\beta = 0.5 - 1$ to $\beta = 1.5 - 1.8$.

• Number of hidden nodes

One hidden layer is used here because it is sufficient for most classification
problems [Q]. The sensitivity of the quark/gluon separation probability to the number
of hidden nodes $N_h$ was tested with $N_h = 3 - 15$. All resulting points fall
within a 1% (71 - 72%) window (see Fig. 4) ⁷.

• Learning rate $\eta$

The learning rate $\eta$ is a factor in updating the weights. We varied its value
between 0.0001 and 0.05 (see the lower left plot in Fig. 4). The value $\eta =
0.005$ was chosen for our analysis.

• Scale parameter $\gamma_{scale}$

The optimal learning rate $\eta$ varies during learning while the network converges
towards the solution. The scale factor for its change is determined by the
parameter $\gamma_{scale}$. The lower right plot in Fig. 4 shows that the optimal
performance is achieved at the default value $\gamma_{scale} = -1$.

⁷ To be exact, a slightly better result is achieved with $N_h = 11$.

Fig. 3: The quark/gluon separation probability using the "0.5-criterion" as a function of the number of
training cycles $N_{train}$.

Fig. 4: The quark/gluon separation probability using the "0.5-criterion" for various network
parameters: inverse temperature $\beta = 1/T$, the number of hidden nodes $N_h$, learning rate $\eta$, and the
$\eta$ updating scale parameter $\gamma_{scale}$.

As was mentioned above, the Manhattan updating method was used here during
the training procedure. In Table 2, for the case of jets in the Barrel region
and $100 < E_t^{\gamma} < 120$ GeV (Set 2), this method is compared with other updating
algorithms with various values of the learning parameters: the learning rate $\eta$ (Backpropagation,
Langevin) and the noise term $\sigma$ (Langevin). It is seen that by varying $\eta$ and $\sigma$ from their
default values in the JETNET package (the first column for each algorithm) one can
approximately reach the value of the separation probability obtained by using the
Manhattan algorithm (72%).

Table 2: Dependence of the separation probability (%) using the "0.5-criterion" on the updating method.
CMSJET, Set 2, Barrel region, $100 < E_t^{\gamma} < 120$ GeV.

Method         Backpropagation                              Langevin
Parameters     η = 1.0    η = 0.5    η = 0.1 - 0.001        η = 1.0     η = 0.1 - 0.01    η = 0.01
                                                            σ = 0.01    σ = 0.01          σ = 0.001
Probab. (%)    51         68         71                     69          70                71

6. Description of the results.

As an example of the "0.5-criterion" application, Table 3 presents the discrimination
powers obtained after the simulation at the PYTHIA level and event selection
according to the cuts (10) - (18) of Section 3 for three intervals of the direct
photon $E_t$.

Table 3: The quark/gluon separation probability (%) using the "0.5-criterion". Barrel and Endcap+Forward
regions. PYTHIA level simulation.

Calorimeter region    Set No.    $E_t^{\gamma}$ interval (GeV)
                                 40 - 50    100 - 120    200 - 240
Barrel                01         74         76           79
                      02         75         77           82
Endcap+Forward        01         70         69           69
                      02         73         74           75

The error is of the order of 1.5 - 2% for all numbers in the table above.



We see that by using the "0.5-criterion" for the ANN output, when the output
node value $O > 0.5$ is interpreted as a quark jet and $O < 0.5$ as a gluon jet, the
network correctly classifies 75 - 82% (73 - 75%) of jets at the PYTHIA level in the
Barrel (Endcap+Forward) region with the input data of Set 02, and 74 - 79%
(69 - 70%) with the input data of Set 01 (see Section 4). The separation probability is
thus seen to grow by 1 - 3% in the Barrel region after introducing the information on
the number of tracks $N_{track}$. The analogous increase for the Endcap+Forward region is 3 - 6%.
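The separation probability with respect to the "0.5-criterion" is simply the fraction of correctly classified jets. A minimal sketch is shown below; the function name `separation_probability` and the toy outputs are hypothetical, with label 1 standing for a quark jet and label 0 for a gluon jet.

```python
def separation_probability(outputs, labels, cut=0.5):
    """Fraction of jets classified correctly when an NN output above `cut`
    is interpreted as a quark jet (label 1) and an output below it as a
    gluon jet (label 0)."""
    correct = sum((o > cut) == (label == 1) for o, label in zip(outputs, labels))
    return correct / len(outputs)

# Hypothetical NN outputs for three quark jets and three gluon jets.
nn_outputs = [0.9, 0.6, 0.4, 0.2, 0.7, 0.3]
true_labels = [1, 1, 1, 0, 0, 0]
print(separation_probability(nn_outputs, true_labels))
```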

To give an understanding of such an improvement we plot, as an example, the
distribution of the number of events over the number of tracks with $E_t > 1$ GeV
in quark and gluon jets, i.e. $N_{track}^{quark}$ and $N_{track}^{gluon}$, for $40 < E_t^{\gamma} < 50$ GeV and
$200 < E_t^{\gamma} < 240$ GeV in the Endcap+Forward region (Fig. 5) ⁸. Due to the
larger probability of bremsstrahlung from a gluon than from a quark we obtain the
ratio $\langle N_{track}^{gluon}\rangle / \langle N_{track}^{quark}\rangle$ equal to 1.27 for $40 < E_t^{\gamma} < 50$ GeV and 1.46 for
$200 < E_t^{\gamma} < 240$ GeV.



Fig. 5: Distribution over the number of charged tracks with $E_t > 1$ GeV for quark and gluon jets found in the
Endcap+Forward region: $40 < E_t^{\gamma} < 50$ GeV (a) and $200 < E_t^{\gamma} < 240$ GeV (b).

Figures 6 - 8, obtained after the full event simulation with the help of CMSJET,
also explain the choice of the variables at the input to the NN in Section 4. As is seen
from Figs. 6 and 7, $E_t$ of the leading cell ("Et1" in the plots) in a quark jet is, on
average, 25 - 30% greater than in a gluon jet. The difference in $E_t$ for the
next-to-leading cells ("Et2" in the plots) in quark and gluon jets is about 10 - 20% (it is
smaller for jets with a higher $E_t$). $E_t$ of a complete quark jet is also greater than $E_t$
of a complete gluon jet (by 4 - 10%). Again, the difference becomes smaller with
growing jet $E_t$.

Figure 8 shows the distribution of the averaged $E_t$ in quark and gluon jets over
the distance $R_{ic}$ from the jet $E_t$ leading cell for all $E_t^{\gamma}$ intervals and calorimeter
regions considered in this paper. One can note that in all cases the averaged $E_t$ in a
quark jet up to $R_{ic}$ of 0.12 - 0.14 is greater than in a gluon jet and, vice versa, the
averaged $E_t$ in a quark jet for $R_{ic} > 0.14$ is lower than in a gluon jet.

⁸ For comparison, see also the quark and gluon jet multiplicities found in experiments by the DELPHI pl[l,
OPAL iH and D0 [23] collaborations.

It is more useful for practical applications to investigate the Signal/Background
ratios for different NN output thresholds ⁹. This analysis was done after the full
simulation with CMSJET and event selection according to cuts (10) - (19).

The Signal/Background ratios corresponding to the "Set 2" input NN information
are given in Table 4 for three $E_t^{\gamma}$ intervals and two calorimeter regions. As a
complement to Table 4, Fig. 10 shows the quark selection and gluon rejection
efficiencies in the case of the full simulation for the same $E_t^{\gamma}$ intervals and calorimeter
regions.



Table 4: Signal/Background. The full event simulation using CMSJET. Set 2.

$E_t^{\gamma}$ (GeV)    Region            NN output cut
                                          0.3     0.4     0.5     0.6     0.7     0.8
40 - 50                 Barrel            1.45    1.91    2.40    3.11    4.19    6.16
                        Endcap+Forward    1.41    1.81    2.38    3.10    4.04    5.85
100 - 120               Barrel            1.72    2.63    3.26    4.04    4.59    6.37
                        Endcap+Forward    1.75    2.19    2.95    3.61    4.21    5.41
200 - 240               Barrel            1.76    2.37    3.35    4.26    5.56    7.36
                        Endcap+Forward    1.64    2.40    3.17    4.16    5.39    7.45



The Signal/Background ratio grows both with growing NN output threshold
value and with increasing $E_t^{\gamma}$ value (see Table 4). Thus, for the Endcap+Forward
region it grows from 2.4 to 3.2 at the NN output cut $O > 0.5$ and from 4.0 to 5.4 at
$O > 0.7$. The curves in Fig. 10 show that for the latter cut ($O > 0.7$) about 38% and
44% of the events with a quark jet are selected for $40 < E_t^{\gamma} < 50$ GeV and
$200 < E_t^{\gamma} < 240$ GeV, respectively, while about 66 - 67% of the events with a quark jet are
selected at $O > 0.5$ for both $E_t^{\gamma}$ intervals and the same calorimeter region.
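The quantities discussed here can be computed from the NN outputs of the test samples as sketched below. The function name `signal_to_background` and the toy outputs are hypothetical, and the sketch assumes that the ratio is formed from quark (signal) and gluon (background) test samples of equal size, as used at the testing stage.

```python
def signal_to_background(quark_outputs, gluon_outputs, cut):
    """Return (S/B ratio, quark selection efficiency, gluon rejection
    efficiency) for events with NN output above `cut`.

    Assumes quark_outputs and gluon_outputs come from test samples of
    equal size, so that the ratio of passing events is the S/B ratio.
    """
    n_signal = sum(o > cut for o in quark_outputs)
    n_background = sum(o > cut for o in gluon_outputs)
    quark_efficiency = n_signal / len(quark_outputs)
    gluon_rejection = 1.0 - n_background / len(gluon_outputs)
    ratio = n_signal / n_background if n_background else float("inf")
    return ratio, quark_efficiency, gluon_rejection

# Hypothetical NN outputs for four quark jets and four gluon jets.
quark_out = [0.9, 0.8, 0.6, 0.4]
gluon_out = [0.7, 0.3, 0.2, 0.1]
for cut in (0.3, 0.5, 0.7):
    print(cut, signal_to_background(quark_out, gluon_out, cut))
```

Raising the cut trades quark selection efficiency for a higher Signal/Background ratio, which is the behavior described in the text.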

The Signal/Background ratio dependence on the NN output cut at the PYTHIA
level is presented in Figs. 12 and 13.

It is also important for practical realizations to know the dependence of the
Signal/Background ratios on the quark jet selection efficiencies. This dependence is
plotted in Fig. 11 for the two extreme $E_t^{\gamma}$ intervals considered in this paper and two
calorimeter regions. We present two curves obtained with Set 1 and Set 2 of input
information after the full CMSJET event simulation (thin and thick solid lines) and
one curve (dotted line) obtained with Set 02 after event simulation at the PYTHIA
level.

⁹ Not only for the point 0.5 as in Table 3 above.
Fig. 6: Distribution over $E_t$ of the leading cell (Et1), $E_t$ of the next-to-leading cell (Et2) and $E_t$ of the full
quark and gluon jets. CMSJET, Endcap+Forward, $40 < E_t^{\gamma} < 50$ GeV.

Fig. 7: Distribution over $E_t$ of the leading cell (Et1), $E_t$ of the next-to-leading cell (Et2) and $E_t$ of the full
quark and gluon jets. CMSJET, Endcap+Forward, $100 < E_t^{\gamma} < 120$ GeV.

Fig. 8: Distribution of $E_t$ over the distance $R_{ic}$ (in the $\eta - \phi$ space) from the initiator cell inside
quark (solid line) and gluon (dashed line) jets. The left-hand column corresponds to the Barrel region
and the right-hand column to the Endcap+Forward region. The CMSJET simulation.
7. Some additional remarks.

The results obtained with the quark and gluon jets found in the CMSJET simulation
were compared with the results obtained after passing the quark and gluon jet
particles through the electromagnetic (ECAL) and hadronic (HCAL) calorimeters in
the CMSIM package [p^]. The discrimination probabilities obtained after the cell
analysis in CMSIM are found to be in good agreement (within 1 - 2%) with those
obtained in CMSJET. It was also found that almost the same discrimination powers can
be achieved both in CMSJET and in CMSIM by using as the network input the
$E_t$ of the first 15 $E_t$-ordered ECAL cells and 15 HCAL cells (i.e. 30 input nodes)
instead of the 45 input nodes considered above (see Section 4).
Fig. 9: Distribution over the number of ECAL jet cells (plots 1a and 2a) and HCAL jet cells (plots 1b and 2b)
for jets found in the Barrel region: $40 < E_t^{\gamma} < 50$ GeV (1a, 1b) and $200 < E_t^{\gamma} <
240$ GeV (2a, 2b).

The sensitivity of the network to some parameters is also noteworthy. Thus, the
network is able to classify quark and gluon jets correctly with respect to the
"0.5-criterion" in 65% (67%) of events with $40 < E_t^{\gamma} < 50$ GeV ($200 < E_t^{\gamma} < 240$ GeV)
by using the $N_{track}$ variable alone. These results can be improved by 2 - 3% if we
add to $N_{track}$ two more input variables: the numbers of activated cells (towers)
in the ECAL and the HCAL belonging to quark and gluon jets. The distributions over
those numbers are shown in Fig. 9 for two $E_t^{\gamma}$ intervals: $40 < E_t^{\gamma} < 50$ GeV and
$200 < E_t^{\gamma} < 240$ GeV. We see that the mean number of activated cells in the ECAL
for gluon jets exceeds that for quark jets by a
factor of 1.23 for $40 < E_t^{\gamma} < 50$ GeV. This factor grows to 1.28
for $200 < E_t^{\gamma} < 240$ GeV. In both intervals the corresponding ratio of the mean numbers of
activated cells in the HCAL is about 1.16 - 1.17.
Fig. 10: Quark jet selection and gluon jet rejection efficiencies as a function of the neural network output
cut. The left-hand (1a, 2a, 3a) and right-hand (1b, 2b, 3b) columns correspond to the Barrel and the
Endcap+Forward regions, respectively. The first row plots (1a, 1b) are the distributions for events selected
with $40 < E_t^{\gamma} < 50$ GeV, the second row (2a, 2b) with $100 < E_t^{\gamma} < 120$ GeV and the third (3a, 3b)
with $200 < E_t^{\gamma} < 240$ GeV. Set 2.

Fig. 11: Signal to background ratio versus quark jet selection efficiency. The left-hand column (1a, 1b)
corresponds to the events with jets found in the Barrel and the right-hand column (2a, 2b) to the
events with jets found in the Endcap+Forward region. The first row (1a, 2a) shows the distributions for
events selected with $40 < E_t^{\gamma} < 50$ GeV and the second row (1b, 2b) with $200 < E_t^{\gamma} < 240$ GeV.
Fig. 12: Signal/Background ratio as a function of the NN output threshold value at the PYTHIA level.
Barrel region.

Fig. 13: Signal/Background ratio as a function of the NN output threshold value at the PYTHIA level.
Endcap+Forward region.